Search for: All records where Creators/Authors contains: "Dasari, Mallesham"

Note: When you click a Digital Object Identifier (DOI) link, you will be taken to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the embargo period (an administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from this site's.

  1. High-quality environment lighting is essential for creating immersive mobile augmented reality (AR) experiences. However, achieving visually coherent lighting estimation for mobile AR is challenging because of key limitations in AR device sensing, including a narrow camera field of view (FoV) and limited pixel dynamic range. Recent advancements in generative AI, which can generate high-quality images from different types of prompts, including text and images, present a potential solution for high-quality lighting estimation. Still, to use generative image diffusion models effectively, we must address two key limitations: content quality and slow inference. In this work, we design and implement a generative lighting estimation system called CleAR that can produce high-quality, diverse environment maps in the format of 360° HDR images. Specifically, we design a two-step generation pipeline guided by AR environment context data to ensure that the output aligns with the physical environment's visual context and color appearance. To improve estimation robustness under different lighting conditions, we design a real-time refinement component that adjusts lighting estimation results on AR devices. To train and test our generative models, we curate a large-scale environment lighting estimation dataset with diverse lighting conditions. Through a combination of quantitative and qualitative evaluations, we show that CleAR outperforms state-of-the-art lighting estimation methods in estimation accuracy, latency, and robustness, and is rated by 31 participants as producing better renderings for most virtual objects. For example, CleAR achieves a 51% to 56% accuracy improvement on virtual object renderings across objects with three distinct material and reflective properties. CleAR produces lighting estimates of comparable or better quality in just 3.2 seconds, over 110X faster than state-of-the-art methods. Moreover, CleAR supports real-time refinement of lighting estimation results, ensuring robust and timely updates for AR applications. (A rough sketch of the refinement idea follows this entry.)
    Free, publicly-accessible full text available September 3, 2026
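    The abstract does not include code; as a rough illustration of what an on-device refinement step could look like, the sketch below rescales a generated HDR environment map so its color balance tracks the live camera frame. The function name, matching rule, and array shapes are illustrative assumptions, not CleAR's published algorithm.

    # Minimal sketch (assumptions noted above): pull a generated 360-degree HDR
    # environment map toward the live camera frame's color balance.
    import numpy as np

    def refine_environment_map(env_map: np.ndarray, camera_frame: np.ndarray) -> np.ndarray:
        """Match the environment map's per-channel mean color to the camera frame.

        env_map:      (H, W, 3) float32 equirectangular HDR image, linear radiance.
        camera_frame: (h, w, 3) float32 linear-RGB frame from the AR session.
        """
        env_mean = env_map.reshape(-1, 3).mean(axis=0)       # estimate's mean color
        cam_mean = camera_frame.reshape(-1, 3).mean(axis=0)  # observed mean color
        # Per-channel gain that pulls the estimate toward the observed color cast;
        # the epsilon guards against division by zero in very dark scenes.
        gain = cam_mean / np.maximum(env_mean, 1e-6)
        return env_map * gain  # broadcast the (3,) gain over every pixel

    # Usage with synthetic data:
    env = np.random.rand(256, 512, 3).astype(np.float32)    # stand-in env map
    frame = np.random.rand(480, 640, 3).astype(np.float32)  # stand-in camera frame
    refined = refine_environment_map(env, frame)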
  2. Free, publicly-accessible full text available February 26, 2026
  3. Virtual Reality (VR) telepresence platforms are being challenged to support live performances, sporting events, and conferences with thousands of users across seamless virtual worlds. Current systems have struggled to meet these demands, which has led to high-profile performance events with groups of users isolated in parallel sessions. The core difference between scaling VR environments and classic 2D video content delivery is the dynamic, spatially dependent, peer-to-peer nature of communication: users have many pairwise interactions that grow and shrink as they explore spaces. In this paper, we discuss the challenges of VR scaling and present an architecture that supports hundreds of users with spatial audio and video in a single virtual environment. We leverage the property of spatial locality with two key optimizations: (1) a Quality of Service (QoS) scheme that prioritizes audio and video traffic based on users' locality, and (2) a resource manager that allocates client connections across multiple servers based on user proximity within the virtual world. Through real-world deployments and extensive evaluations in real and simulated environments, we demonstrate the scalability of our platform and show improved QoS compared with existing approaches. (A hedged sketch of locality-based prioritization follows this entry.)
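    As a rough illustration of the spatial-locality idea, the sketch below assigns per-peer audio/video quality tiers from virtual-world distance. The tier thresholds and bitrates are invented for illustration and are not the paper's actual QoS parameters.

    # Minimal sketch (hypothetical values): rank each peer stream by distance
    # in the virtual world and assign audio/video bitrate tiers accordingly.
    import math

    TIERS = [  # (max distance, video kbps, audio kbps) -- illustrative only
        (5.0,  1200, 64),   # nearby peers: full quality
        (15.0,  400, 32),   # mid-range peers: reduced video
        (40.0,    0, 16),   # distant peers: audio only
    ]

    def prioritize_streams(me, peers):
        """Return per-peer (video_kbps, audio_kbps) based on distance to `me`."""
        plan = {}
        for pid, pos in peers.items():
            d = math.dist(me, pos)  # Euclidean distance in the virtual world
            for max_d, vid, aud in TIERS:
                if d <= max_d:
                    plan[pid] = (vid, aud)
                    break
            else:
                plan[pid] = (0, 0)  # out of range: drop both streams
        return plan

    # Usage: alice is close (full quality), bob is far (audio only).
    print(prioritize_streams((0, 0, 0), {"alice": (1, 2, 0), "bob": (30, 0, 0)}))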
  4. In recent years, streamed 360° videos have gained popularity in Virtual Reality (VR) and Augmented Reality (AR) applications. However, they have much higher resolutions than 2D videos, so they consume more bandwidth when streamed. This increased bandwidth utilization puts tremendous strain on the network capacity of the cloud providers streaming these videos. In this paper, we introduce L3BOU, a novel three-tier distributed software framework that reduces cloud-edge bandwidth in the backhaul network and lowers average end-to-end latency for 360° video streaming applications. The L3BOU framework achieves low bandwidth and low latency by leveraging edge-based, optimized upscaling techniques. Specifically, the cloud delivers down-scaled, MPEG-DASH-encoded 360° video, known as Ultra Low Resolution (ULR) data, to which the L3BOU edge applies distributed super-resolution (SR) techniques before providing high-quality video to the client. L3BOU reduces the cloud-edge backhaul bandwidth by up to a factor of 24, and the optimized multi-process super-resolution of ULR data provides a 10-fold latency decrease in SR upscaling at the edge. (A minimal sketch of edge-side tiling and parallel upscaling follows this entry.)
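    To make the edge-side step concrete, here is a minimal sketch that splits a ULR frame into bands and upscales them in parallel worker processes, standing in for L3BOU's distributed super-resolution. The nearest-neighbor upscale is a placeholder for a learned SR model; the tile count and scale factor are assumptions, not the paper's configuration.

    # Minimal sketch: parallel, tile-based upscaling of a downscaled (ULR) frame.
    from concurrent.futures import ProcessPoolExecutor
    import numpy as np

    SCALE = 4  # hypothetical upscale factor

    def upscale_tile(tile: np.ndarray) -> np.ndarray:
        # Placeholder for a real SR model: nearest-neighbor 4x upscale.
        return tile.repeat(SCALE, axis=0).repeat(SCALE, axis=1)

    def upscale_frame(frame: np.ndarray, n_tiles: int = 4) -> np.ndarray:
        # Split the equirectangular frame into horizontal bands, one per worker.
        tiles = np.array_split(frame, n_tiles, axis=0)
        with ProcessPoolExecutor(max_workers=n_tiles) as pool:
            upscaled = list(pool.map(upscale_tile, tiles))
        return np.concatenate(upscaled, axis=0)

    if __name__ == "__main__":  # guard required for multiprocessing
        ulr = np.random.randint(0, 256, (270, 480, 3), dtype=np.uint8)
        hi = upscale_frame(ulr)
        print(ulr.shape, "->", hi.shape)  # (270, 480, 3) -> (1080, 1920, 3)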
  5.
  6.
  7. A large fraction of users in developing regions use relatively inexpensive, low-end smartphones. However, the impact of device capabilities on the performance of mobile Internet applications has not been explored. To bridge this gap, we study the quality of experience (QoE) of three popular applications -- Web browsing, video streaming, and video telephony -- for different device parameters. Our results demonstrate that the performance of Web browsing is much more sensitive to low-end hardware than that of the video applications, especially video streaming. This is because the video applications exploit specialized coprocessors/accelerators and thread-level parallelism on multi-core mobile devices, and even low-end devices are equipped with the needed coprocessors and multiple cores. In contrast, Web browsing is largely influenced by clock frequency and uses no more than two cores, which makes its performance more vulnerable on low-end smartphones. Based on the lessons learned from studying the video applications, we explore offloading Web computation to a coprocessor. Specifically, we offload regular-expression computation to a DSP coprocessor and show an 18% improvement in page load time while reducing energy consumption by a factor of four. (A hedged sketch of the offload pattern follows this entry.)
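    As a rough illustration of the offload pattern, the sketch below hands regular-expression scans from a "browser" thread to a worker that stands in for the DSP coprocessor. The queue-based interface is an illustrative assumption; the paper targets an actual DSP, not a Python thread.

    # Minimal sketch: dispatch regex jobs to a worker instead of running inline.
    import re
    import queue
    import threading

    jobs = queue.Queue()     # (pattern, text) work items
    results = queue.Queue()  # lists of matches

    def dsp_worker():
        # Stand-in for the coprocessor: pull jobs, return matches.
        while True:
            pattern, text = jobs.get()
            if pattern is None:  # shutdown sentinel
                break
            results.put(re.findall(pattern, text))
            jobs.task_done()

    threading.Thread(target=dsp_worker, daemon=True).start()

    # "Main browser thread" offloads a scan instead of blocking on it.
    html = "<a href='x.css'></a><a href='y.js'></a>"
    jobs.put((r"href='([^']+)'", html))
    print(results.get())  # ['x.css', 'y.js']
    jobs.put((None, ""))  # stop the worker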
  8.